Sparse optimization on measures with over-parameterized gradient descent

Authors

Abstract

Minimizing a convex function of a measure with a sparsity-inducing penalty is a typical problem arising, e.g., in sparse spikes deconvolution or two-layer neural networks training. We show that this problem can be solved by discretizing the measure and running non-convex gradient descent on the positions and weights of the particles. For measures on a d-dimensional manifold and under some non-degeneracy assumptions, this leads to a global optimization algorithm with a complexity scaling as $$\log (1/\epsilon )$$ in the desired accuracy $$\epsilon $$, instead of $$\epsilon ^{-d}$$ for convex methods. The key theoretical tools are a local convergence analysis in Wasserstein space and an analysis of a perturbed mirror descent in the space of measures. Our bounds involve quantities exponential in d, which is unavoidable under our assumptions.
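
The following is only a rough illustration of the particle discretization described in the abstract, not the paper's algorithm or setting: the 1-D Gaussian observation model and all constants (kernel width, penalty lam, step size eta, particle count m) are assumptions made for the sketch. A sparse-spikes objective is over-parameterized with many weighted particles, and plain gradient descent is run on both weights and positions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground truth: a sparse measure with 2 spikes, observed through a Gaussian kernel.
true_pos, true_w = np.array([0.3, 0.7]), np.array([1.0, -0.5])
grid = np.linspace(0.0, 1.0, 50)                    # observation grid
sigma = 0.05                                        # kernel width (illustrative)

def kernel(x):
    # Feature matrix: one Gaussian bump per particle, evaluated on the grid.
    return np.exp(-((grid[:, None] - x[None, :]) ** 2) / (2 * sigma ** 2))

y = kernel(true_pos) @ true_w                       # noiseless observations

m, lam, eta, n = 20, 1e-3, 0.1, grid.size           # over-parameterization, penalty, step size
pos = rng.uniform(0.0, 1.0, m)                      # particle positions, spread out at init
w = np.zeros(m)                                     # particle weights, start at zero

for _ in range(20000):
    K = kernel(pos)
    resid = K @ w - y
    # Gradient of (1/2n)*||K w - y||^2 + lam*||w||_1 with respect to the weights.
    grad_w = K.T @ resid / n + lam * np.sign(w)
    # Gradient with respect to the positions, via d(kernel)/d(position).
    dK = (grid[:, None] - pos[None, :]) / sigma ** 2 * K
    grad_pos = w * (dK.T @ resid) / n
    w -= eta * grad_w
    pos -= eta * grad_pos

print("positions of the two heaviest particles:", np.sort(pos[np.argsort(-np.abs(w))[:2]]))
```

Over-parameterizing with many more particles than true spikes is what lets plain gradient descent reach a global optimum in the regime the paper studies; the non-degeneracy assumptions mentioned in the abstract are not checked in this toy sketch.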

Similar Articles

Sparse Communication for Distributed Gradient Descent

We make distributed stochastic gradient descent faster by exchanging sparse updates instead of dense updates. Gradient updates are positively skewed as most updates are near zero, so we map the 99% smallest updates (by absolute value) to zero then exchange sparse matrices. This method can be combined with quantization to further improve the compression. We explore different configurations and a...
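
A minimal sketch of the gradient-dropping step described above; the residual accumulation of dropped entries and all names here are my own illustration, not necessarily the authors' exact implementation. Only the 1% largest-magnitude entries are kept and exchanged as a sparse index/value pair.

```python
import numpy as np

def sparsify(grad, residual, drop_ratio=0.99):
    """Drop the smallest-magnitude entries; return kept (indices, values) and the new residual."""
    acc = grad + residual                        # add back what was dropped previously
    k = max(1, int(round(acc.size * (1 - drop_ratio))))
    idx = np.argpartition(np.abs(acc), -k)[-k:]  # indices of the k largest |entries|
    values = acc[idx]
    new_residual = acc.copy()
    new_residual[idx] = 0.0                      # everything else stays local for later rounds
    return idx, values, new_residual

# toy usage on one worker's gradient
rng = np.random.default_rng(0)
grad = rng.normal(size=10_000)
residual = np.zeros_like(grad)
idx, values, residual = sparsify(grad, residual)
print(f"exchanging {idx.size} of {grad.size} entries")
```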

Dictionary Learning with Large Step Gradient Descent for Sparse Representations

This work presents a new algorithm for dictionary learning. Existing algorithms such as MOD and K-SVD often fail to find the best dictionary because they get trapped in a local minimum. Olshausen and Field’s Sparsenet algorithm relies on a fixed step projected gradient descent. With the right step, it can avoid local minima and converge towards the global minimum. The problem then becomes to fi...
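
A minimal sketch of the kind of fixed-step projected gradient update on the dictionary that the abstract refers to; the least-squares objective, the unit-norm atom constraint, and the step value are assumptions made for illustration, not the authors' exact algorithm. With the sparse codes A held fixed, one gradient step is taken on 0.5*||X - D A||_F^2 and each atom is projected back onto the unit ball.

```python
import numpy as np

def dictionary_step(D, A, X, step):
    """One fixed-step projected gradient update of the dictionary D."""
    grad = (D @ A - X) @ A.T                          # gradient of 0.5*||X - D A||_F^2 in D
    D = D - step * grad                               # fixed (possibly large) step
    norms = np.maximum(np.linalg.norm(D, axis=0), 1.0)
    return D / norms                                  # project each atom onto the unit ball

# toy usage
rng = np.random.default_rng(0)
X = rng.normal(size=(16, 200))                        # data matrix, one signal per column
D = rng.normal(size=(16, 32)); D /= np.linalg.norm(D, axis=0)
A = rng.normal(size=(32, 200)) * (rng.random((32, 200)) < 0.1)   # fixed sparse codes
D = dictionary_step(D, A, X, step=0.5)
```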

Multiple-gradient Descent Algorithm for Multiobjective Optimization

The steepest-descent method is a well-known and effective single-objective descent algorithm when the gradient of the objective function is known. Here, we propose a particular generalization of this method to multi-objective optimization by considering the concurrent minimization of n smooth criteria {J_i} (i = 1, ..., n). The novel algorithm is based on the following observation: consider a...
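
The abstract is cut off before it states the observation, so the following is only my illustration of the kind of common descent step such a method takes, restricted to the special case n = 2; the closed-form minimum-norm combination below is an assumption made for simplicity, not the paper's general construction. The minimum-norm point of the segment between the two gradients gives a direction along whose negative neither criterion increases.

```python
import numpy as np

def common_descent_direction(g1, g2):
    """Min-norm convex combination t*g1 + (1-t)*g2; -d is a common descent direction."""
    diff = g1 - g2
    denom = diff @ diff
    t = 0.5 if denom == 0.0 else float(np.clip((g2 - g1) @ g2 / denom, 0.0, 1.0))
    return t * g1 + (1.0 - t) * g2

# toy usage: two smooth criteria J1(x) = ||x||^2 and J2(x) = ||x - (3, 0)||^2
x = np.array([1.0, 2.0])
g1 = 2.0 * x
g2 = 2.0 * (x - np.array([3.0, 0.0]))
d = common_descent_direction(g1, g2)
x_next = x - 0.1 * d   # small step; d = 0 would indicate a Pareto-stationary point
```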

Simple2Complex: Global Optimization by Gradient Descent

A method named simple2complex for modeling and training deep neural networks is proposed. Simple2complex trains deep neural networks by smoothly adding more and more layers to a shallow network; as the learning procedure goes on, the network keeps growing. Compared with end-to-end learning, simple2complex is less likely to become trapped in local minima, namely, it has the ability for...

Convex Optimization: A Gradient Descent Approach

This algorithm is introduced by [Zin]. Assuming $$\{l_t(\cdot )\}_{t=1}^{T}$$ are differentiable, start with some arbitrary $$x_1 \in X$$. For $$t = 1, \ldots , T$$: • $$z_{t+1} \leftarrow x_t - \eta \nabla l_t(x_t)$$ • $$x_{t+1} \leftarrow \Pi _X(z_{t+1})$$, where $$\Pi _X(\cdot )$$ is the projection function. The performance of OGD is described as follows. Theorem. If there exist positive constants $$G, D$$ such that $$\Vert \nabla l_t(x)\Vert _2 \le G$$ for all $$t$$ and $$x \in X$$, and $$\Vert X\Vert _2 := \max _{x,y \in X} \Vert x - y\Vert _2 \le D$$...
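
A minimal runnable sketch of the online projected gradient descent scheme above; the loss family, the feasible set, and the step size in the toy usage are placeholders of my own. Each round takes a gradient step on the current loss and projects back onto X.

```python
import numpy as np

def ogd(x1, grads, project, eta):
    """Online projected gradient descent: z = x - eta*grad_t(x), then x <- Pi_X(z)."""
    x = np.asarray(x1, dtype=float)
    iterates = [x]
    for grad_t in grads:                 # one gradient oracle per round t = 1, ..., T
        z = x - eta * grad_t(x)          # gradient step on the current loss l_t
        x = project(z)                   # projection Pi_X back onto the feasible set
        iterates.append(x)
    return iterates

# toy usage: X is the unit Euclidean ball, l_t(x) = ||x - c_t||^2
project = lambda z: z / max(1.0, np.linalg.norm(z))
centers = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
grads = [lambda x, c=c: 2.0 * (x - c) for c in centers]
xs = ogd(np.zeros(2), grads, project, eta=0.3)
```

For reference, the classical analysis of [Zin] under these boundedness assumptions gives regret of order GD√T when the step size is chosen of order D/(G√T).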

Journal

Journal title: Mathematical Programming

Year: 2021

ISSN: 0025-5610, 1436-4646

DOI: https://doi.org/10.1007/s10107-021-01636-z